Why Open Science

Open science is about making the methods, data and outcomes in your analysis available to everyone. It includes:

In this tutorial, you are not going to learn all aspects of open science as listed above. However, you will learn one tool that can be used to make your workflows:

You will learn how to document your work by connecting data, methods and outputs in one or more reports or documents. You will learn the R Markdown file format, which can be used to generate reports that connect your data, code (the methods used to process the data) and outputs. You will use the rmarkdown and knitr packages to write R Markdown files in RStudio and publish them in different formats (HTML, PDF, etc.).

About R Markdown

Simply put, .Rmd is a text-based file format that allows you to include descriptive text, code blocks and code output in one file. You can run the code in R using a package called knitr (which you will learn about next). You can export the plain-text .Rmd file to a nicely rendered, shareable format like PDF or HTML. When you knit (that is, run knitr), the accompanying code is executed, so the outputs (e.g. plots and other figures) appear in the rendered document.

“R Markdown (.Rmd) is an authoring format that enables easy creation of dynamic documents, presentations, and reports from R. It combines the core syntax of markdown (an easy to write plain text format) with embedded R code chunks that are run so their output can be included in the final document. R Markdown documents are fully reproducible (they can be automatically regenerated whenever underlying R code or data changes).” (RStudio documentation)

R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks within the document. You embed an R code chunk by wrapping the code in a fence that opens with three backticks followed by {r} and closes with three backticks.

There are also several options that you can add inside the {r} chunk header to change how your code runs (e.g. {r, include=FALSE}).
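As a minimal sketch of what a chunk does, the snippet below builds a tiny R Markdown document as text and knits it with knitr's knit() function. The three-line document is a made-up example, not part of the lab, and knit(text = ...) is used only so that the example runs without creating a file.

```r
library(knitr)

# A made-up, three-line R Markdown document: one chunk that adds two numbers.
rmd <- c(
  "```{r}",
  "1 + 1",
  "```"
)

# knit() runs the chunk and returns Markdown text holding the code and its output.
out <- knit(text = rmd, quiet = TRUE)
cat(out)
```

In a real .Rmd file you would type the chunk directly into the document and click Knit instead of calling knit() yourself.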

Markdown basics

Now let’s learn additional basics that you can use for creating your markdown documents.

Text

Plain text

End a line with two spaces

to start a new paragraph.

Highlighted text and special characters

italics and bold

verbatim code

sub/superscript22

strikethrough

escaped: * _ \

endash: –, emdash: —

equation: \(A = \pi*r^{2}\)

equation block: \[E = mc^{2}\]

block quote

Header1

Header 2

Header 3

Header 4

Header 5
Header 6

HTML ignored in pdfs

http://www.rstudio.com

link

Jump to Header 1

image: Caption

  • unordered list

  • sub-item 1

  • sub-item 2

  • sub-sub-item 1

  • item 2 Continued (indent 4 spaces)

  1. ordered list
  2. item 2
  1. sub-item 1 A. sub-sub-item 1
  1. A list whose numbering continues after
  2. an interruption

Term 1: Definition 1

Right Left Default Center
12 12 12 12
123 123 123 123
1 1 1 1
  • slide bullet 1
  • slide bullet 2 (>- to have bullets appear on click)

horizontal rule/slide break: ***

A footnote [^1]

[^1]: Here is the footnote.

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
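The effect of echo = FALSE can be checked on a small text-output chunk; a plot chunk treats its source code the same way. This is only a sketch: the chunk contents are made up, and knit(text = ...) is used so that nothing is written to disk.

```r
library(knitr)

# The same made-up chunk twice: once with default options, once with echo = FALSE.
shown  <- knit(text = c("```{r}", "sum(1:10)", "```"), quiet = TRUE)
hidden <- knit(text = c("```{r, echo=FALSE}", "sum(1:10)", "```"), quiet = TRUE)

cat(shown)   # contains the source line and the result
cat(hidden)  # contains only the result
```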

Data and packages

library(ggplot2)
library(sf)
## Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE
library(tidyverse)
## ── Attaching packages
## ───────────────────────────────────────
## tidyverse 1.3.2 ──
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ✔ purrr   0.3.4      
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
  1. Let’s also grab some data here. This is a spatial point dataset that I collected as part of a project in the Open Space and Mountain Parks of Boulder, Colorado. It consists of the points where people have taken pictures and shared them on Flickr and Panoramio. We have also collected several spatial variables that might explain why individuals take photographs at these points, along with the same variables for all other points in the park. We will import the data as an sf spatial dataset.
boulder <- st_read("C:/Users/dbvanber/Dropbox (University of Michigan)/Geovis/Labs/Adv_Week_1/BoulderSocialMedia.shp")
## Reading layer `BoulderSocialMedia' from data source 
##   `C:\Users\dbvanber\Dropbox (University of Michigan)\Geovis\Labs\Adv_Week_1\BoulderSocialMedia.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 55519 features and 12 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -788775 ymin: 1917813 xmax: -780555 ymax: 1930053
## Projected CRS: NAD_1983_Albers
boulder
## Simple feature collection with 55519 features and 12 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -788775 ymin: 1917813 xmax: -780555 ymax: 1930053
## Projected CRS: NAD_1983_Albers
## First 10 features:
##            id     DB   extent Climb_dist TrailH_Dis NatMrk_Dis Trails_dis
## 1  6517284333 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 2  6517281191 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 3  6517278961 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 4  6517276295 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 5  6517274727 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 6  6517272539 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 7  6517270109 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 8  6516904527 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 9  6516902971 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 10 6516900761 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
##    Bike_dis PrarDg_Dis PT_Elev Hydro_dis Street_dis                geometry
## 1  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 2  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 3  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 4  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 5  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 6  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 7  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 8  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 9  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 10 1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)

Here are the details of the data:

Variable Description
DB indicates whether the point is a social media location (Flickr or Panoramio) or a point in the park
extent extent that can be viewed at each point estimated through viewshed analysis
Climb_dist distance to nearest climbing wall
TrailH_Dis distance to hiking trails
NatMrk_Dis distance to natural landmark
Trails_dis distance to walking trails
Bike_dis distance to biking trails
PrarDg_Dis distance to prairie dog mounds
PT_Elev Elevation
Hydro_dis distance to lakes, rivers and creeks
Street_dis distance to streets and parking lots
  1. We can plot these variables using ggplot2. We add the sf data to the plot with the geom_sf function, whose arguments control how the features (points, lines or polygons) are drawn. For example, color = controls the outline of a feature, fill = controls its interior (fill = NA leaves it unfilled), and alpha = controls its opacity. The final call applies a complete theme, which controls the non-data display (e.g. neatlines, graticule, title). More details on these themes can be found at https://ggplot2.tidyverse.org/reference/ggtheme.html. Here we use theme_bw, the black-and-white theme. You can try other themes to explore the different options.
ggplot() +
    geom_sf(data =boulder,
    fill = NA, alpha = .2) +
    theme_bw()

  1. At the moment, the map looks a bit skewed because of the projection the data came in (NAD_1983_Albers). Let’s reproject the data using a projection appropriate for Colorado. You can use the epsg.io website to choose an appropriate projection.
boulder = st_transform(boulder, 26753) 
ggplot() +
    geom_sf(data =boulder,
    fill = NA, alpha = .2) +
    theme_bw()
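To confirm that a reprojection took effect, you can inspect the coordinate reference system before and after the transform. The sketch below uses a single made-up longitude/latitude point rather than the Boulder data, and assumes the sf package is installed.

```r
library(sf)

# A single made-up point near Boulder, given as longitude/latitude (EPSG:4326).
pt <- st_sfc(st_point(c(-105.27, 40.01)), crs = 4326)

# Reproject to the EPSG code used above and check which CRS the result carries.
pt_co <- st_transform(pt, 26753)
st_crs(pt_co)$epsg      # the EPSG code of the new CRS
st_is_longlat(pt_co)    # FALSE once the data are in a projected CRS
```

The same two checks work on the boulder object after st_transform().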

  1. Now we will explore different methods for visualizing this data. We will add ‘Gradient colour scales’ in ggplot2. Here is the documentation of these options https://ggplot2.tidyverse.org/reference/scale_gradient.html.
ggplot() +
    geom_sf(data =boulder, aes(color=PT_Elev),
    fill = NA, alpha = .2) +
    theme_bw()

  1. ggplot2 has several gradient colour scale options. The details can be found here.
ggplot() +
    geom_sf(data =boulder, aes(color=PT_Elev),
    fill = NA, alpha = .2) +
  scale_colour_gradientn(colours = terrain.colors(10)) +  
  theme_bw()

  1. Let’s look at the locations above 2200 meters. For this we will use the ifelse() function. It checks the condition in its first argument (PT_Elev >= 2200, i.e. the elevation is at least 2200 meters) and returns the second argument (TRUE) where the condition holds and the third argument (FALSE) where it does not. We use the mutate function to store the result as a new variable in our boulder data frame. We then use ggplot to plot these locations.
# library(dplyr)  # not needed here: dplyr is already attached by library(tidyverse)
boulder %>%
  mutate(high_elev = ifelse(PT_Elev >= 2200, TRUE, FALSE)) %>%
  ggplot() +
  geom_sf(aes(color = high_elev),
          fill = NA, alpha = .2) +
  theme_bw()
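The behaviour of ifelse() is easiest to see on a short vector. The elevations below are made up and stand in for the PT_Elev column; this is a base R sketch, not the lab data.

```r
# Made-up elevations in meters, standing in for the PT_Elev column.
elev <- c(2064, 2250, 2200, 1980)

# ifelse(test, yes, no) returns `yes` where the test holds and `no` elsewhere.
high_elev <- ifelse(elev >= 2200, TRUE, FALSE)
high_elev

# Because the test is already logical, the comparison alone gives the same vector.
identical(high_elev, elev >= 2200)
```

ifelse() becomes more useful when the two outcomes are not simply TRUE and FALSE, for example labels such as "high" and "low".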

  1. We can also plot different charts using ggplot. Let’s compare the distance to the nearest street for the two sources of social media photographs. Here we use filter() to keep the social media points only. We use a box plot to compare the distribution (median and quartiles) of the distance from these photographs to the nearest road. What does this test?
boulder %>%
  filter(DB ==  'Pano' | DB == 'Flickr') %>%
  ggplot(aes(x=DB, y=Street_dis)) + 
  geom_boxplot()

As you can see, there is little visible difference between the two sources: the medians and the spread of the distances are very similar. (A box plot is a visual comparison, not a formal significance test.) There are numerous other tests and charts that you can use to investigate the relationship between the locations of social media photographs and other locations in the park.
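The quantities a box plot summarizes can also be computed directly. The following base R sketch uses a small made-up data frame (not the Boulder data) with the same column names as in the plot above.

```r
# Made-up distances (meters) for two photo sources; not the Boulder data.
photos <- data.frame(
  DB = c("Flickr", "Flickr", "Flickr", "Pano", "Pano", "Pano"),
  Street_dis = c(100, 200, 300, 150, 250, 350)
)

# Median and interquartile range per source.
med <- tapply(photos$Street_dis, photos$DB, median)
iqr <- tapply(photos$Street_dis, photos$DB, IQR)
med
iqr
```

Running the same two tapply() calls on the filtered boulder data gives numbers to report alongside the chart.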

Additional Geovis tools

We are also going to learn about two new packages that might be helpful for your data science approach. The first is viridis (library(viridis)), which provides color palettes designed to remain readable for people with color vision deficiency.

The color scale

The package viridis contains four color scales: “Viridis”, the primary choice, and three alternatives with similar properties, “magma”, “plasma”, and “inferno”.

library(sf)
library(ggspatial)
library(viridis)
## Loading required package: viridisLite
## the function returns hexadecimal color codes
## the integer sets the number of colors
magma(10)
##  [1] "#000004FF" "#180F3EFF" "#451077FF" "#721F81FF" "#9F2F7FFF" "#CD4071FF"
##  [7] "#F1605DFF" "#FD9567FF" "#FEC98DFF" "#FCFDBFFF"
boulder <- st_read("C:/Users/dbvanber/Dropbox (University of Michigan)/Geovis/Labs/Adv_Week_1/BoulderSocialMedia.shp")
## Reading layer `BoulderSocialMedia' from data source 
##   `C:\Users\dbvanber\Dropbox (University of Michigan)\Geovis\Labs\Adv_Week_1\BoulderSocialMedia.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 55519 features and 12 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -788775 ymin: 1917813 xmax: -780555 ymax: 1930053
## Projected CRS: NAD_1983_Albers
ggplot() +
    geom_sf(data = boulder, aes(color=PT_Elev),
    fill = NA, alpha = .2) + 
    scale_colour_gradientn(colours = magma(10))
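As an alternative to passing magma(10) to scale_colour_gradientn(), ggplot2 ships its own continuous viridis scale. The sketch below uses a made-up data frame in place of the Boulder points; only the scale call is the point of the example.

```r
library(ggplot2)

# Made-up points with an elevation-like value; not the Boulder data.
pts <- data.frame(x = 1:5, y = c(2, 4, 3, 5, 1),
                  elev = c(1700, 1850, 2000, 2150, 2300))

# scale_colour_viridis_c() is ggplot2's built-in continuous viridis scale;
# option = "magma" selects the same palette family as magma(10) above.
p <- ggplot(pts, aes(x = x, y = y, colour = elev)) +
  geom_point(size = 3) +
  scale_colour_viridis_c(option = "magma")
p
```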

We can also plot discrete values.

summary(boulder$DB)
##    Length     Class      Mode 
##     55519 character character
p <- ggplot() +
  annotation_spatial(boulder) +
  layer_spatial(boulder, aes(col = DB))
p + scale_color_brewer(palette = "Dark2")

tmap

Alternatively, we can use tmap, another way to create maps in R.

library(tmap)
## Add the data - these are specific to the vector or raster
tm_shape(boulder) + 
  ## which variable, is there a class interval, palette, and other options
  tm_symbols(col='PT_Elev', 
             style='quantile', 
             palette = 'YlOrRd',
             border.lwd = NA,
             size = 0.1)

It is really easy to add cartographic elements in tmap

## here we are using a simple dataset of the world 
# tmap_mode("plot")
data("World")
tm_shape(World) +
    tm_polygons("gdp_cap_est", style='quantile', legend.title = "GDP Per Capita Estimate")

It is really easy to make an interactive map in tmap as well

## the view mode creates an interactive map
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(World) +
    tm_polygons("gdp_cap_est", style='quantile', legend.title = "GDP Per Capita Estimate")

Advanced Week 1 Lab Assignment

In this week’s lab, you will make an open science markdown document that records your process of data analysis and geovisualization. We will be using git to aid in version control for the code. Your assignment is to use knitr to develop a markdown document that shows your analysis of the Boulder data (you can also use your own data if you wish). Demonstrate how you did your analysis, giving step-by-step instructions with the accompanying code. Include 1 chart and 1 map. Structure and explain your analysis with text, headings, highlights, images and other markdown basics.

Steps for Hosting an HTML of your Rmd as a Website on GitHub (Bonus)

It is rather simple to make your HTML publicly available via GitHub. Here is an example of one I made for a recent paper: https://derekvanberkel.github.io/Planning-for-climate-migration-in-Great-Lake-Legacy-Cities/. Below are the steps to turn the HTML you knit for this lab into a static website. Here is another website that gives more detail: https://blog.flycode.com/how-to-deploy-a-static-website-for-free-using-github-pages

  • Create a GitHub account on github.com.
  • Create a new repository in your GitHub application. Name it your-username.github.io. The name is very important because it will be the URL of your website, so name it accordingly. Note the folder that GitHub is saving the repository to. Make sure the “Push to GitHub?” box is checked.
  • Move your html file that you knit into the folder that GitHub just created when you made the repository. IMPORTANT: Your homepage HTML file must be called “index.html”, and it must exist in the top-level directory.
  • Enter a message in the text box called “commit summary”, something like “initial commit.” Then, click the commit button.
  • Now click the Settings button on the repository site. Navigate to the Pages tab and click it. Under branch, click the None dropdown, choose main, and save it.
  • It might take some time (give it about 10 minutes), then check this page to see the URL associated with the website. Your website should be live!
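The renaming step can also be done from R. This is only a sketch: a temporary folder stands in for the cloned repository and a one-line file stands in for your knitted report, so both paths are placeholders to be replaced with your own.

```r
# Stand-ins for the real paths: a temporary folder plays the role of the cloned
# repository, and a one-line file plays the role of the knitted report.
repo_dir <- file.path(tempdir(), "your-username.github.io")
dir.create(repo_dir, showWarnings = FALSE)
knitted <- file.path(tempdir(), "my_report.html")
writeLines("<html><body>My report</body></html>", knitted)

# The homepage must be called index.html and sit at the top level of the repository.
file.copy(knitted, file.path(repo_dir, "index.html"), overwrite = TRUE)
list.files(repo_dir)
```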

Questions

  1. What are the benefits and challenges of an open data science approach? Give an example based on this week’s reading. (1-2 paragraphs)
  2. Knit a markdown document that demonstrates an analysis of this or other data (include: text explaining your analysis, figures and geovisualizations)

Bonus: Include a screen grab of the history of your git commits. What is your strategy for using git?

Here are the evaluation criteria for the geovisualizations. Questions will be worth 30% of your grade, while the geovisualization and explanation will be worth 70%.

Evaluation | Highly well-done | Well-done | Some deficiencies | Several deficiencies
Cartographic principles - 20% (title, name, date, north arrow, scale, legend, explanation symbols) | Elements present and correctly portrayed (100%) | Most elements present and correctly portrayed (99-80%) | Some elements (when appropriate) present and correctly portrayed (79-50%) | Minimal information (<50%)
Presentation and Legibility - 20% (readable, consistency and ease of understanding, flow of ideas consistent with cognition, clear explanation of content) | Highly legible, consistent and easy to understand (100%) | Mostly legible, consistent and easy to understand (99-80%) | Somewhat legible, consistent and easy to understand (79-50%) | Minimally legible, consistent and poorly understandable (<50%)
Content - 20% (relevant, coherent and interesting topic, appropriate subject matter given the presented information/data, free of bias and error) | Highly relevant, coherent, and interesting; consistent information free of bias and error (100%) | Mostly relevant, coherent, and interesting; consistent information free of bias and error (99-80%) | Somewhat relevant, coherent, and interesting; some inconsistencies in information (79-50%) | Minimally relevant, coherent, and interesting; inconsistencies in information (<50%)
Aesthetics - 20% (is the map attractive, are there objective elements that are popularly viewed as beautiful) | Highly attractive/beautiful (100%) | Mostly attractive/beautiful (99-80%) | Somewhat attractive/beautiful (79-50%) | Minimally attractive/beautiful (<50%)
Creativity and persuasiveness - 20% (imaginative information/data, convincing argumentation, presence of sustainability principles) | Highly imaginative; convincing of sustainability principles (100%) | Mostly imaginative; convincing of sustainability principles (99-80%) | Somewhat imaginative; less convincing of sustainability principles (79-50%) | Minimally imaginative; not convincing of sustainability principles (<50%)

GitHub (optional instructions)

Git is an open-source version control system that was developed by Linus Torvalds (the same person who developed Linux). What is version control? When we create some code, we are constantly changing it. Version control systems keep ‘versions’ of these changes (you write the description of each version yourself) in a repository. This can help with collaboration, as everyone can download the latest version of the software, make changes, and upload the newest revision. Every developer can see these new changes, download them, and contribute. Similarly, people who have nothing to do with the development of a project can still download the files and use them. Git is the preferred version control system of most developers, as it stores file changes efficiently and ensures file integrity.

GitHub consists of repositories (“repos”), which store all the files for a particular project. Each project has its own repo, and you can access it with a unique URL. You can create a new project based off of another project that already exists, which is called “forking”. This encourages the further development of programs and other projects. For example, if you find a project on GitHub that you find useful, you can fork the repo, make changes, and release the revised project as a new repo. If the original repository that you forked to create your new project gets updated, you can easily add those updates to your current fork. You can also create a pull request. The pull request allows the original author to see your work, and then choose whether or not to accept it into the official project. Each user on GitHub has their own profile that acts like a resume of sorts, showing your past work and contributions to other projects via pull requests. Project revisions can be discussed publicly, so a mass of experts can contribute knowledge and collaborate to advance a project. When multiple people collaborate on a project, it’s hard to keep track of revisions: who changed what, when, and where those files are stored. GitHub takes care of this problem by keeping track of all the changes that have been pushed to the repository.

Workflow and terminology

Credits: GitHub repository and Rahul Agrawal

Master

GitHub uses the term “master” or “main” for the primary version of a source code repository. Developers make copies of the “master” on their computers into which they add their own code, and then merge the changes back into the “master” repo.

Branch

Branches allow you to develop features, fix bugs, or safely experiment with new ideas in a contained area of your repository.

You always create a branch from an existing branch. Typically, you might create a new branch from the default branch of your repository. You can then work on this new branch in isolation from changes that other people are making to the repository.

In practice, you can get help with a branch and work independently on it.

You document these changes with a new commit message, e.g. “testing new code for this map”.

This allows you to go back to specific code stages, for example, if you made a change that didn’t work.

There is also the potential for automated testing in git.

When you are happy with the changes, you can merge them back into the master.

GitHub from the command line (it’s easier than RStudio!)

To start, create a GitHub account here

Once you have an account you can create your first repository

Choose the details of the repository. The name is the folder that you will download and where the source code will be stored (make sure to choose a unique name so that you don’t overwrite anything when cloning). You can optionally add a description of the repository. Next choose whether you want it to be private or public. The public option allows anyone to clone your code. It can be a good way to document and advertise your work. The private option allows you to share with specific developers, while keeping your work closed to the public. This might be the preferred option if your work is sensitive. The option to add a README is nice if the repository is public. You can add markdown elements to help explain your code. The .gitignore is a list of file names that will not be tracked by git (i.e. files that can exist in the repository directory but will be ignored when you commit). The license choice is important for public repositories, where data is often shared. For your repository to truly be open source, you’ll need to license it so that others are free to use, change, and distribute the software.

Make your choices for the repository.

Now we have a repository. Take note that it has the details that we added. It has the master and branch description as we discussed above.

Now we can clone this repository to our local machine. Hit the Code button and choose to copy the URL for cloning the repository.

Now we are going to use the Command Prompt (PC) or Terminal (Mac) to clone the repository to our computers. Both PCs and Macs should have the required software for this operation (but you might need to download git in some cases; for Windows and OS X see http://git-scm.com/downloads). See additional instructions below. To start, we will navigate to the directory where we want to clone the repository. The commands below change directories to your desired location:

PCs: cd C:\Data\tmp

Macs: cd ~/Desktop/sass/css

You can use ls (dir on a PC) to list the files and directories in the directory that you just navigated to. This is helpful to know, as you will be adding additional files to this directory from GitHub. Now we can use the git clone command to clone the repository into this directory. If you have never connected to your GitHub repository, you will be prompted to fill in your user id and password. This may work, but alternatively you might need to get your API key (GitHub calls this a personal access token) and fill in this information. Instructions can be found here. Use the API key as the password.

If you have already added the API key, you won’t be prompted for your username and password

Now if you navigate to the folder that you cloned, you will find the directory (in this case Temp) and files (.gitignore and README.md). Any changes to the directory are now “version controlled”, and can be committed and pushed to GitHub. This essentially allows us to track changes and document them systematically in the event that we need to backtrack.

We can also add files to the master that will later be included in versioning. Let’s say we are working on an R script in RStudio. If we add this file to our local folder, it can be staged for the master using the git add command.

This queues up a file to be added to the GitHub repository. You can continue this process with the same, or different files, queuing up what you want to add to the repository. The queued files are only “committed” when you use the commit command and add some text describing this change. This text helps you remember at what phase of the project you committed the file. The -m (message) flag attaches that text to the commit. First change the directory to the actual repository:

PCs: cd Temp

Macs: cd Temp

Now, specify the file that you want to add using the file name, i.e. git add ReadSF.R. When you are ready to actually commit this to the repository, you add some text after the -m (message) flag using git commit -m "Add the R file". Finally, you use git push origin to send the commit to GitHub.

Now if we work on this .R file, git will know that we have made changes and these will be recorded. Remember that the .R file needs to be saved for the changes to be recorded.

Using the command git status shows us any possible changes

For these changes to be “registered”, we will need to follow the same procedure above that names and pushes the changes.

First, use the git add . command to stage these new changes for committing. We can use git status to check if this is correct.

The ReadSF.R changes have now been staged, as indicated by the file name turning green. We can now commit them using

git commit -m "new variable subset"

and

git push origin
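The stage, commit and status cycle described above can be tried safely in a throwaway repository. The sketch below drives git from R with system2(); the git() helper, the demo identity and the temporary folder are all made up for this example, and git push origin is left out because the throwaway repository has no remote.

```r
# Throwaway repository in a temporary folder, so nothing real is touched.
repo <- file.path(tempdir(), "git-demo")
dir.create(repo)

# Helper (hypothetical, for this sketch only): run git inside `repo` with a demo identity.
git <- function(...) {
  system2("git", c("-C", shQuote(repo),
                   "-c", "user.name=Demo", "-c", "user.email=demo@example.com",
                   "-c", "commit.gpgsign=false", ...),
          stdout = TRUE)
}

git("init")
writeLines("library(sf)", file.path(repo, "ReadSF.R"))
git("add", "ReadSF.R")                             # stage the new file
git("commit", "-m", shQuote("Add the R file"))     # record it with a message

# Edit the file, then ask git what changed.
cat("library(ggplot2)\n", file = file.path(repo, "ReadSF.R"), append = TRUE)
before <- git("status", "--short")   # the file is listed as modified
git("add", ".")
git("commit", "-m", shQuote("new variable subset"))
after <- git("status", "--short")    # nothing left to commit
before
after
```

In a terminal you would type the same git commands directly, without the helper.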

Developing in a Branch

Sometimes you want to develop in a “branch”. This keeps changes that you might not want to include in the master separate from it. You can add a branch using the command git branch [branch name]. To start, let’s examine the branches using git branch -a. The -a flag indicates that you want to print all the branches. Since we only have a master branch, this should be the only one printed. The branch you are working on is indicated by a *.

If you want to add a branch, you use the git branch command, naming it what you like.

Now that we have created a branch, we can start working in this branch using git checkout [branch name].

Now any changes made to the .R file will be “recorded” in the branch and not in the master. Changes can be merged into the master by running git merge [branch name] while the master is checked out.
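A minimal sketch of the branch workflow, again in a throwaway repository driven from R. The git() helper, the demo identity and the branch name new-map are made up; the primary branch name is read from git because it may be master or main depending on your setup.

```r
# Throwaway repository in a temporary folder.
repo <- file.path(tempdir(), "git-branch-demo")
dir.create(repo)
script <- file.path(repo, "ReadSF.R")

# Helper (hypothetical, for this sketch only): run git inside `repo` with a demo identity.
git <- function(...) {
  system2("git", c("-C", shQuote(repo),
                   "-c", "user.name=Demo", "-c", "user.email=demo@example.com",
                   "-c", "commit.gpgsign=false", ...),
          stdout = TRUE)
}

git("init")
default <- git("symbolic-ref", "--short", "HEAD")  # "master" or "main"
writeLines("library(sf)", script)
git("add", "ReadSF.R")
git("commit", "-m", shQuote("Add the R file"))

git("branch", "new-map")      # create the branch (the name is made up)
git("checkout", "new-map")    # switch to it
cat("library(ggplot2)\n", file = script, append = TRUE)
git("add", "ReadSF.R")
git("commit", "-m", shQuote("testing new code for this map"))

git("checkout", default)      # back on the primary branch: the edit is not here yet
on_default <- readLines(script)
git("merge", "new-map")       # bring the branch's commit into the primary branch
merged <- readLines(script)
git("branch", "-a")           # list all branches
```

Comparing on_default with merged shows that the new line only reaches the primary branch after the merge.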

GitHub for RStudio

Once you’ve installed your preferred Version Control system, you’ll need to activate it on your system by following these steps:

  1. Go to Global Options (from the Tools menu)
  2. Click Git/SVN
  3. Click Enable version control interface for RStudio projects
  4. If necessary, enter the path for your Git or SVN executable where provided (Navigate to the git program for this). You can also create or add your RSA key for SSH if necessary.

Now we can start version control locally. First we start a new project in File > New Project.... This will close any tabs that you have open in RStudio, so be sure to save them. You will now click New Directory > New Project. Type in the name of the directory to store your project, e.g. “my_Geovis_project”. Select the checkbox for Create a git repository and click Create Project.

In this new project, a tab should appear showing that git is enabled.

Now let’s start version control. Add a function (e.g. the ggplot below).

ggplot() +
    geom_sf(data =boulder, aes(color=PT_Elev),
    fill = NA, alpha = .2) +
  scale_colour_gradientn(colours = terrain.colors(10)) +  
  theme_bw()

Hit the git tab and check the box next to this markdown Rmd. Hit commit to show the commit window.

Here we will describe the commit so we can remember what we were doing at this point in time. After the commit, you can close the windows.

Now we will change the code in some way, and record the new version of the document. Let’s add the code for reading in the boulder data and save the new version. Repeat the commit process by clicking the checkmark next to the file you are working on and hitting commit. Give this newer version of the code a new description. Notice that the new line of code is shown in green, indicating that it has not yet been committed. After the commit, you can close the windows.

boulder <- st_read("C:/Data/Labs/Geovis/Adv_Week_1/Locationdata.shp")
ggplot() +
    geom_sf(data =boulder, aes(color=PT_Elev),
    fill = NA, alpha = .2) +
  scale_colour_gradientn(colours = terrain.colors(10)) +  
  theme_bw()

Let’s demonstrate how we can revert to an older version of the code. To do this, hit the Diff button and navigate to the History tab. The SHA code identifies the different versions that you have committed. Choose the SHA code that you want to revert to and copy it.

To actually do the reset we will be using the terminal window in RStudio. If this is not already a window tab, activate it using Tools > Terminal > New Terminal. We will reset to the older version by typing git reset --hard [your SHA] in the terminal (use the right button on your mouse to paste the SHA). You will need to use the SHA code specific to your version for the reset. Remember that a hard reset reverts everything to that stage, meaning that commits made after it, and any uncommitted changes, will be lost.
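The effect of a hard reset can be seen in a throwaway repository with two commits. As before, this is a sketch driven from R: the git() helper, the demo identity and the file contents are made up, and the SHA is read with git rev-parse rather than copied from the History tab.

```r
# Throwaway repository in a temporary folder.
repo <- file.path(tempdir(), "git-reset-demo")
dir.create(repo)
script <- file.path(repo, "ReadSF.R")

# Helper (hypothetical, for this sketch only): run git inside `repo` with a demo identity.
git <- function(...) {
  system2("git", c("-C", shQuote(repo),
                   "-c", "user.name=Demo", "-c", "user.email=demo@example.com",
                   "-c", "commit.gpgsign=false", ...),
          stdout = TRUE)
}

git("init")
writeLines("# version 1", script)
git("add", "ReadSF.R")
git("commit", "-m", shQuote("first version"))
first_sha <- git("rev-parse", "HEAD")   # SHA of the first commit

writeLines("# version 2", script)
git("add", "ReadSF.R")
git("commit", "-m", shQuote("second version"))

# Move the branch back to the first commit and overwrite the working files.
git("reset", "--hard", first_sha)
restored <- readLines(script)
history  <- git("log", "--oneline")
restored   # the file is back to its first version
history    # only the first commit is left in the log
```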